In this paper, we consider a single-index mixed model with longitudinal data.
A new set of estimating equations is proposed to estimate the single-index
coefficient. The link function is estimated by using the local linear smoothing.
Asymptotic normality is established for the proposed estimators. Also, the
estimator of the link function achieves optimal convergence rates; and the
estimators of variance components have root-n consistency. These results facilitate
the construction of confidence regions/intervals and hypothesis testing for the
parameters of interest. Some simulations and an application to real data are
also included.

1 Introduction



Consider the single-index mixed model

$$Y_{ij} = g(X_{ij}^T\beta_0) + \alpha_i + \varepsilon_{ij}, \quad i = 1,\dots,n,\ j = 1,\dots,m, \tag{1.1}$$

where $\alpha_i$ and $\varepsilon_{ij}$ are independent mean zero random variables with variances
$\sigma_\alpha^2 \ge 0$ and $\sigma_\varepsilon^2 > 0$, respectively, $g(\cdot)$ is an unknown link function, and
$\beta_0$ is a $p \times 1$ vector of unknown parameters. For the sake of identifiability, it is often
assumed that $\|\beta_0\| = 1$ and the first nonzero component of $\beta_0$ is positive, where $\|\cdot\|$
denotes the Euclidean norm. Let $Y_i = (Y_{i1},\dots,Y_{im})^T$, $X_i = (X_{i1},\dots,X_{im})^T$,
$\varepsilon_i = (\varepsilon_{i1},\dots,\varepsilon_{im})^T$ and $G(X_i\beta_0) = (g(X_{i1}^T\beta_0),\dots,g(X_{im}^T\beta_0))^T$. The
model implies that the $Y_i$ are independent with $E(Y_i|X_i) = G(X_i\beta_0)$ and
$\mathrm{cov}(Y_i|X_i) = V = \sigma_\alpha^2 1_m 1_m^T + \sigma_\varepsilon^2 I_m$, where $1_m$ is an $m \times 1$ vector of ones and $I_m$ is
the $m \times m$ identity matrix.
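
To make the covariance structure implied by (1.1) concrete, the following Python sketch simulates data from the model and compares the empirical covariance of $\alpha_i + \varepsilon_{ij}$ with $V$. The uniform covariate design, the link $g = \sin$, the parameter values and all function names are assumptions chosen for illustration; they do not come from the paper.

```python
import numpy as np

def compound_symmetry(m, sigma_alpha2, sigma_eps2):
    """V = sigma_alpha^2 * 1 1^T + sigma_eps^2 * I_m."""
    return sigma_alpha2 * np.ones((m, m)) + sigma_eps2 * np.eye(m)

def compound_symmetry_inverse(m, sigma_alpha2, sigma_eps2):
    """Closed-form inverse of V (Sherman-Morrison formula)."""
    shrink = sigma_alpha2 / (sigma_eps2 + m * sigma_alpha2)
    return (np.eye(m) - shrink * np.ones((m, m))) / sigma_eps2

def simulate(n, m, beta0, g, sigma_alpha2, sigma_eps2, rng):
    """Draw X (n, m, p) and Y (n, m) from Y_ij = g(X_ij^T beta0) + alpha_i + eps_ij."""
    X = rng.uniform(-1.0, 1.0, size=(n, m, beta0.size))   # illustrative design
    alpha = rng.normal(0.0, np.sqrt(sigma_alpha2), size=(n, 1))
    eps = rng.normal(0.0, np.sqrt(sigma_eps2), size=(n, m))
    return X, g(X @ beta0) + alpha + eps

rng = np.random.default_rng(0)
beta0 = np.array([2.0, 1.0, -2.0]) / 3.0       # unit norm, first component positive
X, Y = simulate(5000, 4, beta0, np.sin, 1.0, 0.5, rng)
resid = Y - np.sin(X @ beta0)                   # equals alpha_i + eps_ij
V = compound_symmetry(4, 1.0, 0.5)
V_empirical = resid.T @ resid / resid.shape[0]  # should be close to V
```

The closed-form inverse is convenient later because the estimating equations only involve $V^{-1}$.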

We address the general problem of estimating the parameter $\beta_0$, the function $g(\cdot)$, and
the variance components $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$ simultaneously when $m$ is fixed. We will show in
Section 3 that the variance components $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$ can be estimated at the parametric
rate $O_P(n^{-1/2})$, which allows us to treat them as known when we derive the theoretical results
for $\beta_0$ and $g(\cdot)$ in Sections 2 and 3.

The single-index model is an important tool in multivariate nonparametric regression, 
which can avoid the so-called "curse of dimensionality" by searching a univariate index of 
the multivariate covariate X to capture important features of high-dimensional data. The 
single-index model has been applied in a variety of fields, such as discrete choice analysis 
in econometrics and dose-response models in biometrics (Härdle et al. 1993). For
cross-sectional data, many authors have studied statistical inference for the single-index
model and reported many results, for example, Li (1991), Ichimura (1993), Zhu and Ng
(1995), Xia and Li (1999), Naik and Tsai (2000), Hristache, Juditsky and Spokoiny (2001),
Xia et al. (2002), Stute and Zhu (2005), Xia (2006), and Xue and Zhu (2006). Meanwhile,
the estimation problem of the partially linear single-index model has been widely addressed
as well by Carroll et al. (1997), Yu and Ruppert (2002), Xia and Härdle (2006), Zhu and
Xue (2006), Wang et al. (2010) and others. These methods have proven useful and
effective for independent data. On the other hand, to our knowledge, a
method to treat correlated data, which are commonly seen in econometrics and biometrics,
is lacking in the literature. In this paper, such models will be developed and reported.

Longitudinal data are perhaps the most well-known type of correlated data. There is
already an extensive literature on the generalized linear, nonparametric and semiparametric
mixed models for longitudinal data; see, for example, Zeger and Diggle (1994), Jiang (1998),
Zhang, et al. (1998), Jiang (1999), Ruckstuhl, Welsh and Carroll (2000), Jiang and Zhang 
(2001), Jiang, Jia and Chen (2001), Ke and Wang (2001), Cai, Cheng and Wei (2002), 
Wu and Zhang (2002), Liang, Wu and Carroll (2003), Zhang and Lin (2003), Gu and Ma 
(2005), Hall and Maiti (2006), Jiang (2006) and Field, Pang and Welsh (2008), among 
others. However, the literature on applications of single-index models to longitudinal/panel
data is limited. Honoré and Kyriazidou (2000) and Carro (2007) proposed some estimating
methods for dynamic panel data discrete choice models. Bai et al. (2009) studied the
single-index models for longitudinal data, where they proposed a procedure to estimate the 
single-index component and the link function based on the combination of the penalized 
splines and quadratic inference functions. Liang and Zeger (1986) proposed an extension 
of the generalized linear models to the analysis of longitudinal data. They introduced the 
generalized estimating equations (GEE) that gave consistent estimates of the regression 
parameters and their variance under mild assumptions on the time dependence. The GEE 
were derived without specifying the joint distribution of a subject's observations yet they 
reduced to the score equations for multivariate Gaussian outcomes. In this paper, we apply 
the idea of GEE to the single-index mixed models with longitudinal data. To estimate
the single-index coefficient $\beta_0$, we propose a new set of estimating equations which take
the constraint $\|\beta_0\| = 1$ into account. The estimator based on these estimating equations
outperforms previous ones, as summarized below. First, our estimation procedure does not
specify a form for either the distribution of the random effect or the joint distribution of the
repeated measurements. Second, we introduce estimating equations that give a root-n
consistent estimate of $\beta_0$ under weak assumptions on the joint distribution. Third, we
construct root-n consistent estimates of the variance components $\sigma_\varepsilon^2$ and $\sigma_\alpha^2$. This allows
us to consider the construction of confidence regions and hypothesis testing for $\beta_0$. Lastly,
we also obtain the asymptotic normality and the uniform convergence rate of the estimator
of $g(\cdot)$. Our algorithm is numerically fast and stable.

The rest of the paper is organized as follows. In Section 2, we elaborate on the
methodology. Section 3 presents the asymptotic properties for all proposed estimators. Section 4
reports the results of simulation studies and one real example. The proofs of the main 
theorems are relegated to the Appendix. 

2 Estimation method 

2.1 Estimations of the parametric and nonparametric components 

If $g$ were known, we could estimate $\beta_0$ by minimizing

$$R_n(\beta) = \frac{1}{n}\sum_{i=1}^n \{Y_i - G(X_i\beta)\}^T W(X_i\beta) V^{-1} \{Y_i - G(X_i\beta)\}$$

for $\beta$ with $\|\beta\| = 1$, where $W(X_i\beta) = \mathrm{diag}\{w(X_{i1}^T\beta),\dots,w(X_{im}^T\beta)\}$, and $w(\cdot)$ is a
bounded weight function with a bounded support $\mathcal{U}_w$, which is introduced to control the
boundary effect. For simplicity and convenience, we assume that $dw(u)/du = 0$. In particular,
we can take $w(\cdot) = I_{[-a,a]}(\cdot)$ for some constant $a > 0$. This is a restricted least squares
problem. We now use the constraint $\|\beta_0\| = 1$ to transfer the restricted least squares to the
unrestricted least squares, which makes it possible to search for the solution of the estimating
equations over a restricted region in the Euclidean space $R^{p-1}$. For this, we need to calculate
the derivative of $g(X_{ij}^T\beta)$ at the point $\beta_0$. Note that $\|\beta_0\| = 1$ means that the true value $\beta_0$
is a boundary point of the unit sphere. The function $g(X_{ij}^T\beta)$ does not have a derivative at the
point $\beta_0$. For this, we suggest the popularly used delete-one-component method (Wang et al., 2010).
The detail is as follows. Without loss of generality, we may assume that the true parameter
$\beta_0$ has a positive component (otherwise, consider $-\beta_0$), say $\beta_{0r} > 0$ for
$\beta_0 = (\beta_{01},\dots,\beta_{0p})^T$ and $1 \le r \le p$. For $\beta = (\beta_1,\dots,\beta_p)^T$, let
$\beta^{(r)} = (\beta_1,\dots,\beta_{r-1},\beta_{r+1},\dots,\beta_p)^T$ be a $p-1$ dimensional parameter vector after
removing the $r$th component $\beta_r$ in $\beta$. Then the true parameter $\beta_0^{(r)}$ must satisfy the
constraint $\|\beta_0^{(r)}\| < 1$, and $\beta$ is infinitely differentiable in a
neighborhood of $\beta_0^{(r)}$. The Jacobian matrix of $\beta$ with respect to $\beta^{(r)}$ is defined as

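$$J_{\beta^{(r)}} = \frac{\partial \beta}{\partial \beta^{(r)}} = (\gamma_1,\dots,\gamma_p)^T, \tag{2.1}$$

where $\gamma_s$ ($1 \le s \le p$, $s \ne r$) is a $p-1$ dimensional unit vector with $s$th component 1, and
$\gamma_r = -(1 - \|\beta^{(r)}\|^2)^{-1/2}\beta^{(r)}$. Let $X_{ij} = (X_{ij1},\dots,X_{ijp})^T$ and
$X_{ij}^{(r)} = (X_{ij1},\dots,X_{ij(r-1)},X_{ij(r+1)},\dots,X_{ijp})^T$.
Then we have $X_{ij}^T\beta = X_{ij}^{(r)T}\beta^{(r)} + (1 - \|\beta^{(r)}\|^2)^{1/2}X_{ijr}$, which is a function of $\beta^{(r)}$.
When $g$ is known, we can obtain an estimator of $\beta_0^{(r)}$ by solving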

$$Q_n(G,\beta^{(r)}) = \frac{1}{n}\sum_{i=1}^n J_{\beta^{(r)}}^T X_i^T G'_\Delta(X_i\beta) W(X_i\beta) V^{-1} \{Y_i - G(X_i\beta)\} = 0 \tag{2.2}$$

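for $\beta^{(r)}$, where $G'_\Delta(X_i\beta) = \mathrm{diag}\{g'(X_{i1}^T\beta),\dots,g'(X_{im}^T\beta)\}$. An iteratively reweighted
least squares algorithm is widely used for solving this system of equations. Given a current
estimate $\tilde\beta^{(r)}$ with $\|\tilde\beta\| = 1$, compute

$$\tilde\beta^{*(r)} = \tilde\beta^{(r)} + B_n^{-1}(G,\tilde\beta^{(r)})\,Q_n(G,\tilde\beta^{(r)})$$

and $\hat\beta^{(r)} = \tilde\beta^{*(r)}/\|\tilde\beta^{*(r)}\|$, where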

$$B_n(G,\beta^{(r)}) = \frac{1}{n}\sum_{i=1}^n J_{\beta^{(r)}}^T X_i^T G_\Delta'^{2}(X_i\beta) W(X_i\beta) V^{-1} X_i J_{\beta^{(r)}}.$$

This iteratively reweighted least squares algorithm solves (2.2) and is identical to Fisher's
method of scoring version of the Newton-Raphson algorithm for solving these estimating
equations. Using $\|\beta_0\| = 1$ and $\|\hat\beta\| = 1$, we can prove

$$\hat\beta - \beta_0 = J_{\beta_0^{(r)}}(\hat\beta^{(r)} - \beta_0^{(r)}) + O_P(n^{-1}).$$
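
The linearization above can be checked numerically. The Python sketch below builds the Jacobian of (2.1) for a unit vector and verifies that moving $\beta^{(r)}$ by a small step changes $\beta$ by the Jacobian times the step, up to a second-order remainder. The function names, the 0-based index `r` and the example vector are assumptions made for illustration.

```python
import numpy as np

def full_beta(beta_r, r):
    """Rebuild the unit vector beta from beta^(r); r is the 0-based deleted index."""
    return np.insert(beta_r, r, np.sqrt(1.0 - beta_r @ beta_r))

def jacobian(beta_r, r):
    """p x (p-1) Jacobian d beta / d beta^(r), cf. (2.1)."""
    p = beta_r.size + 1
    J = np.zeros((p, p - 1))
    kept = [s for s in range(p) if s != r]
    J[kept, np.arange(p - 1)] = 1.0                       # rows gamma_s, s != r
    J[r, :] = -beta_r / np.sqrt(1.0 - beta_r @ beta_r)    # row gamma_r
    return J

beta0_r = np.array([1.0, -2.0]) / 3.0        # beta_0 = (2/3, 1/3, -2/3) with r = 0 removed
J = jacobian(beta0_r, r=0)
step = 1e-3 * np.array([1.0, -1.0])
exact = full_beta(beta0_r + step, 0)
linear = full_beta(beta0_r, 0) + J @ step
gap = np.linalg.norm(exact - linear)         # second order in the step size
```

Since every column of the Jacobian is orthogonal to $\beta$, an update of the form $\beta + J\delta$ moves along the tangent space of the unit sphere, which is why a renormalization step only changes the result at second order.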

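Thus, we can obtain an iterative formula for estimating $\beta_0$ when $g$ is known, that is

$$\tilde\beta^* = \tilde\beta + J_{\tilde\beta^{(r)}} B_n^{-1}(G,\tilde\beta)\,Q_n(G,\tilde\beta) \tag{2.3}$$

and $\hat\beta^* = \tilde\beta^*/\|\tilde\beta^*\|$, where the initial value $\tilde\beta$, with $\|\tilde\beta\| = 1$, can be obtained by fitting the
linear model. Then, set $\tilde\beta = \hat\beta^*$ and iterate until convergence.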
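
A minimal Python sketch of the update (2.3) for a known link is given below. It is an illustration under stated assumptions, not the authors' implementation: the data are simulated with $g = \sin$, the start value is a perturbed version of the truth rather than a linear-model fit, and the matrix $B_n$ is coded in the symmetric arrangement $G'_\Delta W V^{-1} G'_\Delta$, which agrees with $G_\Delta'^{2} W V^{-1}$ whenever the diagonal factors commute with $V^{-1}$. All function and variable names are hypothetical.

```python
import numpy as np

def jacobian(beta, r):
    """p x (p-1) Jacobian of beta with respect to beta^(r), cf. (2.1)."""
    p = beta.size
    J = np.zeros((p, p - 1))
    J[[s for s in range(p) if s != r], np.arange(p - 1)] = 1.0
    J[r, :] = -np.delete(beta, r) / beta[r]    # beta_r = sqrt(1 - ||beta^(r)||^2)
    return J

def scoring_step(beta, X, Y, g, dg, V_inv, r, a):
    """One update of the form (2.3) for known g, followed by renormalization."""
    n = X.shape[0]
    J = jacobian(beta, r)
    index = X @ beta                                    # (n, m) array of X_ij^T beta
    w = (np.abs(index) <= a).astype(float)              # w = indicator of [-a, a]
    d = dg(index)
    XJ = X @ J                                          # (n, m, p-1)
    left = XJ * (d * w)[:, :, None]                     # G' W X_i J
    right = np.einsum('ml,ilk->imk', V_inv, XJ * d[:, :, None])   # V^{-1} G' X_i J
    Q = np.einsum('imk,im->k', left, (Y - g(index)) @ V_inv) / n
    B = np.einsum('imk,iml->kl', left, right) / n       # assumed symmetric arrangement
    beta_star = beta + J @ np.linalg.solve(B, Q)
    return beta_star / np.linalg.norm(beta_star)

rng = np.random.default_rng(1)
n, m, sa2, se2 = 300, 4, 0.25, 0.04                     # illustrative sizes and variances
beta0 = np.array([2.0, 1.0, -2.0]) / 3.0
X = rng.uniform(-1.0, 1.0, size=(n, m, 3))
Y = (np.sin(X @ beta0) + rng.normal(0, np.sqrt(sa2), (n, 1))
     + rng.normal(0, np.sqrt(se2), (n, m)))
V_inv = (np.eye(m) - sa2 / (se2 + m * sa2) * np.ones((m, m))) / se2

beta = beta0 + np.array([0.15, -0.10, 0.10])            # perturbed start value
beta /= np.linalg.norm(beta)
for _ in range(25):
    beta = scoring_step(beta, X, Y, np.sin, np.cos, V_inv, r=0, a=10.0)
```

With `a=10.0` the weight equals one for every observation in this design, so the example isolates the scoring step itself.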

Since we assume that the link function $g$ is unknown, it must be estimated. Given an
initial estimate $\tilde\beta$ of $\beta_0$, we can easily compute nonparametric estimates $\hat g$ and $\hat g'$ of $g$ and
$g'$. We employ the local linear smoother (Fan and Gijbels, 1996) to obtain estimators of
the link function $g$ and its derivative $g'$. Specifically, for a kernel function $K(\cdot)$ on the real
line $R^1$ and a bandwidth sequence $h = h_n$ tending to 0, define $K_h(\cdot) = h^{-1}K(\cdot/h)$. For a
fixed $\beta$, the local linear smoother aims at minimizing the weighted sum of squares

$$\sum_{i=1}^n\sum_{j=1}^m \{Y_{ij} - d_0 - d_1(X_{ij}^T\beta - u)\}^2 K_h(X_{ij}^T\beta - u) \tag{2.4}$$

with respect to the parameters $d_\nu$, $\nu = 0, 1$. Let $\hat d_0$ and $\hat d_1$ be the solutions to the weighted
least squares problem (2.4). The local linear estimators for $g(u)$ and $g'(u)$ are defined as
$\hat g(u;\beta_0) = \hat d_0$ and $\hat g'(u;\beta_0) = \hat d_1$ at the fixed point $\beta_0$. It follows from the theory of least
squares that

$$(\hat g(u;\beta_0),\ h\hat g'(u;\beta_0))^T = S_n^{-1}(u;\beta_0)\,\xi_n(u;\beta_0), \tag{2.5}$$

where

$$S_n(u;\beta_0) = \begin{pmatrix} S_{n,0}(u;\beta_0) & S_{n,1}(u;\beta_0) \\ S_{n,1}(u;\beta_0) & S_{n,2}(u;\beta_0) \end{pmatrix}$$

and

$$\xi_n(u;\beta_0) = (\xi_{n,0}(u;\beta_0),\ \xi_{n,1}(u;\beta_0))^T$$

with

$$S_{n,l}(u;\beta_0) = \frac{1}{n}\sum_{i=1}^n\sum_{j=1}^m \left(\frac{X_{ij}^T\beta_0 - u}{h}\right)^l K_h(X_{ij}^T\beta_0 - u) \tag{2.6}$$

and

$$\xi_{n,l}(u;\beta_0) = \frac{1}{n}\sum_{i=1}^n\sum_{j=1}^m Y_{ij}\left(\frac{X_{ij}^T\beta_0 - u}{h}\right)^l K_h(X_{ij}^T\beta_0 - u) \tag{2.7}$$

for $l = 0, 1, 2$.

The estimator $\hat g$ is called the pooled estimator in the existing literature, for example Lin and
Carroll (2000), Ruckstuhl, Welsh and Carroll (2000), and Xue (2010). As pointed out
in these papers, the simple pooled estimator, which ignores the dependence structure,
performs very well asymptotically.
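
The pooled estimator can be sketched in a few lines of Python by solving the $2\times 2$ system in (2.5) directly. The Epanechnikov kernel, the function names and the noiseless linear test data are assumptions made for this illustration; the inputs `index` and `y` are the values $X_{ij}^T\beta$ and $Y_{ij}$ pooled over all subjects and occasions.

```python
import numpy as np

def epanechnikov(t):
    """Bounded, symmetric kernel with support [-1, 1]."""
    return 0.75 * np.clip(1.0 - t**2, 0.0, None)

def local_linear(u, index, y, h, n):
    """Pooled local linear estimates of g(u) and g'(u), following (2.5)-(2.7)."""
    z = (index - u) / h
    k = epanechnikov(z) / h                                  # K_h(X_ij^T beta - u)
    S = np.array([[np.sum(k), np.sum(z * k)],
                  [np.sum(z * k), np.sum(z**2 * k)]]) / n    # S_{n,l}, l = 0, 1, 2
    xi = np.array([np.sum(y * k), np.sum(y * z * k)]) / n    # xi_{n,l}, l = 0, 1
    g_hat, h_dg_hat = np.linalg.solve(S, xi)                 # (g_hat, h * g_hat')
    return g_hat, h_dg_hat / h

index = np.linspace(-1.0, 1.0, 201)
y = 2.0 + 3.0 * index                  # noiseless linear link, reproduced exactly
g_hat, dg_hat = local_linear(0.3, index, y, h=0.2, n=index.size)
```

A local linear fit reproduces any linear function exactly, so the example returns the value $2.9$ and slope $3$ up to rounding error, which is a convenient sanity check on the signs and the factor $h$ in (2.5).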

When $g$ is unknown, we can also obtain an estimator of $\beta_0^{(r)}$ by solving the estimating
equations $Q_n(\hat G,\beta^{(r)}) = 0$, where $\hat G(X_i\beta) = (\hat g(X_{i1}^T\beta;\beta),\dots,\hat g(X_{im}^T\beta;\beta))^T$ and
$\hat G'_\Delta(X_i\beta) = \mathrm{diag}\{\hat g'(X_{i1}^T\beta;\beta),\dots,\hat g'(X_{im}^T\beta;\beta)\}$ for $i = 1,\dots,n$. We propose the use of an
alternating algorithm: first estimate $\beta_0$, then the link function $g$, and repeat these steps until
a convergence criterion is met. Given $\hat g$ and $\hat g'$, we use the scoring algorithm (2.3) to estimate
$\beta_0$, that is

$$\tilde\beta^* = \tilde\beta + J_{\tilde\beta^{(r)}} B_n^{-1}(\hat G,\tilde\beta)\,Q_n(\hat G,\tilde\beta) \tag{2.8}$$

and $\hat\beta^* = \tilde\beta^*/\|\tilde\beta^*\|$; given the estimate of $\beta_0$, we use the pooled estimator (2.5) to get a new
estimate of the link function $g$.

With the final estimate $\hat\beta$, the final estimator of $g$ can be defined by $\hat g^*(u) = \hat g(u;\hat\beta)$. The
asymptotic result for the estimate of the link function $g$ follows from Theorems 1 and 2, and the
result for the estimate of the parameter $\beta_0$ is established in Theorem 3.
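
The alternating scheme can be sketched as follows in Python. This is an illustrative reading of (2.5) and (2.8) under several assumptions that are not in the paper: an Epanechnikov kernel, a fixed bandwidth and number of iterations instead of a stopping rule, a perturbed start value, simulated data with $g = \sin$, a fallback to a local constant fit at isolated index values, and the symmetric arrangement $\hat G'_\Delta W V^{-1} \hat G'_\Delta$ for $B_n$. All names are hypothetical.

```python
import numpy as np

def jacobian(beta, r):
    """p x (p-1) Jacobian of beta with respect to beta^(r), cf. (2.1)."""
    p = beta.size
    J = np.zeros((p, p - 1))
    J[[s for s in range(p) if s != r], np.arange(p - 1)] = 1.0
    J[r, :] = -np.delete(beta, r) / beta[r]
    return J

def pooled_local_linear(index, y, h):
    """Pooled local linear estimates of g and g' at every observed index value."""
    t = index[None, :] - index[:, None]                  # t[a, b] = index_b - index_a
    k = 0.75 * np.clip(1.0 - (t / h) ** 2, 0.0, None) / h
    s0, s1, s2 = k.sum(1), (k * t).sum(1), (k * t**2).sum(1)
    t0, t1 = k @ y, (k * t) @ y
    det = s0 * s2 - s1**2
    ok = det > 1e-10                                     # False only at isolated points
    safe = np.where(ok, det, 1.0)
    g = np.where(ok, (s2 * t0 - s1 * t1) / safe, t0 / s0)    # local constant fallback
    dg = np.where(ok, (s0 * t1 - s1 * t0) / safe, 0.0)
    return g, dg

def fit(X, Y, beta_init, V_inv, h, r, a, n_iter=20):
    """Alternate between the pooled smoother (2.5) and the scoring update (2.8)."""
    n, m, _ = X.shape
    beta = beta_init / np.linalg.norm(beta_init)
    for _ in range(n_iter):
        index = X @ beta
        g_hat, dg_hat = pooled_local_linear(index.ravel(), Y.ravel(), h)
        g_hat, d = g_hat.reshape(n, m), dg_hat.reshape(n, m)
        w = (np.abs(index) <= a).astype(float)           # trims the boundary region
        J = jacobian(beta, r)
        XJ = X @ J
        left = XJ * (d * w)[:, :, None]
        right = np.einsum('ml,ilk->imk', V_inv, XJ * d[:, :, None])
        Q = np.einsum('imk,im->k', left, (Y - g_hat) @ V_inv) / n
        B = np.einsum('imk,iml->kl', left, right) / n
        beta = beta + J @ np.linalg.solve(B, Q)
        beta /= np.linalg.norm(beta)
    g_hat, _ = pooled_local_linear((X @ beta).ravel(), Y.ravel(), h)
    return beta, g_hat.reshape(n, m)

rng = np.random.default_rng(3)
n, m, sa2, se2 = 300, 4, 0.25, 0.04
beta0 = np.array([2.0, 1.0, -2.0]) / 3.0
X = rng.uniform(-1.0, 1.0, size=(n, m, 3))
Y = (np.sin(X @ beta0) + rng.normal(0, np.sqrt(sa2), (n, 1))
     + rng.normal(0, np.sqrt(se2), (n, m)))
V_inv = (np.eye(m) - sa2 / (se2 + m * sa2) * np.ones((m, m))) / se2
beta_hat, g_fit = fit(X, Y, beta0 + np.array([0.15, -0.10, 0.10]), V_inv,
                      h=0.25, r=0, a=1.2)
```

In practice the variance components entering `V_inv` would themselves be estimated, as described in Section 2.2, and the bandwidth would be chosen from the data.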

Remark 1 We consider a homoscedastic model in (1.1), although the estimation procedure
can be extended to heteroscedastic errors. In addition, the single-index assumption in (1.1)
can be readily extended to multiple indices through Sliced Inverse Regression (SIR) or its
variants, but the estimation of the multivariate link function $g$ would encounter the curse of
high dimensionality. In many applications, since no more than three indices will be needed,
the approach in this paper can indeed be extended in practice to multiple indices.

2.2 Estimations of the variance components 

The estimation of the nonparametric component and the asymptotic variances of all the
estimators depend on the variance components, so we need consistent estimators
of the variance components.

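A useful approach to estimate the variance components is to pretend that the residuals
have mean zero and the same covariance matrix as if $g(\cdot)$ were known. If we assume that
the random effects $\alpha_i$ and the errors $\varepsilon_{ij}$ are Gaussian, then the observations $Y_i$
have independent $N(G(X_i\beta_0), V)$ distributions. Replacing $g(\cdot)$ and $\beta_0$ with their estimators
$\hat g(\cdot)$ and $\hat\beta$, respectively, the Gaussian "likelihood" for $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$ can be written as

$$-n(m-1)\log(\sigma_\varepsilon^2) - n\log(\sigma_\varepsilon^2 + m\sigma_\alpha^2) - \frac{1}{\sigma_\varepsilon^2}\sum_{i=1}^n\sum_{j=1}^m \{Y_{ij} - \hat g(X_{ij}^T\hat\beta) - (\bar Y_i - \bar g_i)\}^2 - \frac{m}{\sigma_\varepsilon^2 + m\sigma_\alpha^2}\sum_{i=1}^n (\bar Y_i - \bar g_i)^2,$$

where $\bar Y_i = m^{-1}\sum_{j=1}^m Y_{ij}$ and $\bar g_i = m^{-1}\sum_{j=1}^m \hat g(X_{ij}^T\hat\beta)$. This "likelihood" is maximized at

$$\hat\sigma_\varepsilon^2 = \frac{1}{n(m-1)}\sum_{i=1}^n\sum_{j=1}^m \{Y_{ij} - \hat g(X_{ij}^T\hat\beta) - (\bar Y_i - \bar g_i)\}^2, \tag{2.9}$$

$$\hat\sigma_\alpha^2 = \frac{1}{n}\sum_{i=1}^n (\bar Y_i - \bar g_i)^2 - \hat\sigma_\varepsilon^2/m, \tag{2.10}$$

when $\hat\sigma_\alpha^2 > 0$, and at $\hat\sigma_\alpha^2 = 0$ and

$$\hat\sigma_\varepsilon^2 = \frac{1}{nm}\sum_{i=1}^n\sum_{j=1}^m \{Y_{ij} - \hat g(X_{ij}^T\hat\beta)\}^2 \tag{2.11}$$

otherwise. It can be shown that the resulting estimators have the same convergence rate as
if $g(\cdot)$ and $\beta_0$ were actually known. The result will be given in the next section.

Alternatively, we can abandon the "likelihood" and employ a method of moments device
to get the estimators (2.9)-(2.11). We can also adjust for the loss of degrees of freedom due
to estimating $g(\cdot)$, and obtain the estimators of $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$. The details can be found in
Ruckstuhl et al. (2000).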
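
The estimators (2.9)-(2.11) only require the $n \times m$ array of residuals $Y_{ij} - \hat g(X_{ij}^T\hat\beta)$. The short Python sketch below computes them; the function name and the simulated residuals (drawn directly as $\alpha_i + \varepsilon_{ij}$, as if $g$ and $\beta_0$ were known) are assumptions made for illustration.

```python
import numpy as np

def variance_components(resid):
    """Estimators (2.9)-(2.11) from an (n, m) array of residuals."""
    n, m = resid.shape
    means = resid.mean(axis=1, keepdims=True)                    # subject means
    sigma_eps2 = np.sum((resid - means) ** 2) / (n * (m - 1))    # (2.9)
    sigma_alpha2 = np.mean(means ** 2) - sigma_eps2 / m          # (2.10)
    if sigma_alpha2 <= 0.0:                                      # boundary case
        sigma_alpha2 = 0.0
        sigma_eps2 = np.mean(resid ** 2)                         # (2.11)
    return float(sigma_alpha2), float(sigma_eps2)

rng = np.random.default_rng(2)
n, m = 20000, 4
resid = rng.normal(0.0, 1.0, (n, 1)) + rng.normal(0.0, np.sqrt(0.5), (n, m))
sa2_hat, se2_hat = variance_components(resid)      # near (1.0, 0.5)

# A tiny data set whose between-subject variation is zero triggers (2.11).
sa2_zero, se2_zero = variance_components(np.array([[1.0, -1.0], [-1.0, 1.0]]))
```

The second call illustrates the truncation at zero: the subject means vanish, so (2.10) would be negative and the pooled mean square (2.11) is used instead.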

3 Main Results 

We now study the asymptotic behavior of the estimators for the nonparametric
component $g$ as well as the parametric components $\beta_0$, $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$. We first list the following
regularity conditions:

(C1) The joint density of $(X_{i1}^T\beta,\dots,X_{im}^T\beta)^T$ exists, the marginal density $f_j(u)$ of $X_{ij}^T\beta$ and
the joint density $f_{j_1j_2}(u,s)$ of $(X_{ij_1}^T\beta, X_{ij_2}^T\beta)$, for any $j_1 \ne j_2$, are continuously
differentiable for $u \in \mathcal{U}_w$ and $(u,s) \in \mathcal{U}_w \times \mathcal{U}_w$, respectively, and there exists a $j$ such that
$f_j(u)$ is bounded away from 0, uniformly for $u \in \mathcal{U}_w$ and $\beta$ near $\beta_0$, where $\mathcal{U}_w$ is the
support of $w(u)$.

(C2) The function $g(u)$ has two bounded and continuous derivatives, and $g_{2r}(u)$ satisfies a
Lipschitz condition of order 1 on $\mathcal{U}_w$, where $g_{2r}(u)$ is the $r$th component of $g_2(u)$, and
$g_2(u) = E(X_{ij}\mid X_{ij}^T\beta_0 = u)$, $1 \le r \le p$.

(C3) The kernel $K(\cdot)$ is a bounded and symmetric probability density function with bounded
support, and satisfies the Lipschitz condition of order 1 and $\int u^2K(u)\,du \ne 0$.

(C4) There exists an $r = \max\{4, s\}$ such that $E(|X_{ij}|^r) < \infty$, $E(|\alpha_i|^r) < \infty$ and
$E(|\varepsilon_{ij}|^r) < \infty$, and $n^{2\epsilon-1}h \to \infty$ for some $\epsilon < 2 - s^{-1}$, $i = 1,\dots,n$, $j = 1,\dots,m$.

(C5) $nh^3/\log(1/h) \to \infty$ and $nh^4 \to 0$ as $n \to \infty$.

(C6) $B = E[J_{\beta_0^{(r)}}^T X_1^T G_\Delta'^{2}(X_1\beta_0) W(X_1\beta_0) V^{-1} X_1 J_{\beta_0^{(r)}}]$ is a positive definite matrix.

Remark 2 Condition (C1) ensures that the denominators of $\hat g(u;\beta)$ and $\hat g'(u;\beta)$ are,
with high probability, bounded away from 0 for $u \in \mathcal{U}_w$ and $\beta$ near $\beta_0$. (C2) is the standard
smoothness condition. (C3) is the usual assumption for second-order kernels. (C4) is a
necessary condition for the asymptotic normality of an estimator. (C5) is the usual condition
for bandwidth. (C6) ensures that the limiting variance of the estimator $\hat\beta$ exists.

Let $\mathcal{B}_n = \{\beta \in \mathcal{B} : \|\beta - \beta_0\| \le c_1n^{-1/2}\}$ for some positive constant $c_1$. The definition is
motivated by the fact that, since we anticipate that $\hat\beta$ is root-n consistent, we should look
for a solution of the equations $Q_n(\hat g,\beta^{(r)}) = 0$ which involves $\beta^{(r)}$ distant from $\beta_0^{(r)}$ by order
$n^{-1/2}$. A similar restriction was also made by Härdle, Hall and Ichimura (1993) and Xia and
Li (1999). Denote $\mu_l = \int u^lK(u)\,du$ and $\nu_l = \int K^l(u)\,du$, $l = 1, 2$.
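
For a concrete kernel these constants are simple numbers. The Python sketch below evaluates them by a midpoint rule for the Epanechnikov kernel, which is an assumed example of a kernel satisfying (C3); the function names and the grid size are likewise illustrative.

```python
import numpy as np

def epanechnikov(t):
    """K(t) = 0.75 (1 - t^2) on [-1, 1], zero elsewhere."""
    return 0.75 * np.clip(1.0 - t**2, 0.0, None)

def kernel_constants(kernel, lo=-1.0, hi=1.0, n_grid=20000):
    """Midpoint-rule values of mu_l = int u^l K(u) du and nu_l = int K^l(u) du, l = 1, 2."""
    width = (hi - lo) / n_grid
    u = lo + (np.arange(n_grid) + 0.5) * width
    k = kernel(u)
    mu = {l: float(np.sum(u**l * k) * width) for l in (1, 2)}
    nu = {l: float(np.sum(k**l) * width) for l in (1, 2)}
    return mu, nu

mu, nu = kernel_constants(epanechnikov)
```

For this kernel the exact values are $\mu_1 = 0$, $\mu_2 = 1/5$, $\nu_1 = 1$ and $\nu_2 = 3/5$, which the numerical values reproduce to high accuracy.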

The following theorems state the asymptotic behavior of the estimators proposed in 
Section 2. We first give the uniform convergence rates for the estimators g and g' respectively. 



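Theorem 1 Suppose that conditions (C1)-(C4) hold. Then

$$\sup_{u \in \mathcal{U}_w,\ \beta \in \mathcal{B}_n} |\hat g(u;\beta) - g(u)| = O_P\big((nh/\log n)^{-1/2} + h^2\big)$$

and

$$\sup_{u \in \mathcal{U}_w,\ \beta \in \mathcal{B}_n} |\hat g'(u;\beta) - g'(u)| = O_P\big((nh^3/\log n)^{-1/2} + h\big).$$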



The following Theorem 2 shows the asymptotic normality of estimator g. 
Theorem 2 Suppose that conditions (C1)-(C4) hold. If $nh^5 = O(1)$, then for any
$u \in \mathcal{U}_w$ and $\beta$ such that $\|\beta - \beta_0\| = O_P(n^{-1/2})$, we have

$$\sqrt{nh}\,\{\hat g(u;\beta) - g(u) - b(u)\} \xrightarrow{D} N(0, \sigma^2(u)),$$

where $b(u) = (1/2)h^2\mu_2g''(u)$, and $\sigma^2(u) = (\sigma_\alpha^2 + \sigma_\varepsilon^2)\nu_2 / \sum_{j=1}^m f_j(u)$.
If we further assume that $nh^5 \to 0$, then

$$\sqrt{nh}\,\{\hat g(u;\beta) - g(u)\} \xrightarrow{D} N(0, \sigma^2(u)).$$



In Theorems 1 and 2, when we start with a $\sqrt{n}$-consistent estimator for $\beta_0$, $\hat g$ has the stated uniform
convergence rate and asymptotic normality. Numerous examples of $\sqrt{n}$-consistent estimators
already exist in the literature. For instance, Hall (1989) showed that one can obtain a
$\sqrt{n}$-consistent estimator for $\beta_0$ using projection pursuit regression. Under the linearity condition
that is slightly weaker than elliptical symmetry of $X$, Li (1991), Hsing and Carroll (1992)
and Zhu and Ng (1995) proved that SIR, proposed by Li (1991), leads to a $\sqrt{n}$-consistent
estimator of $\beta_0$. Xia et al. (2002) proposed the minimum average variance estimation
(MAVE) and Xia (2006) proposed a refined version of MAVE, and both methods can provide
$\sqrt{n}$-consistent estimators for the single-index $\beta_0$.



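Theorem 3 Suppose that conditions (C1)-(C6) hold. If the $r$th component of $\beta_0$ is
positive, then

$$\sqrt{n}\,(\hat\beta - \beta_0) \xrightarrow{D} N\big(0,\ J_{\beta_0^{(r)}}B^{-1}AB^{-1}J_{\beta_0^{(r)}}^T\big),$$

where

$$A = E\big[J_{\beta_0^{(r)}}^T\{X_1 - G_1(X_1\beta_0)\}^T G_\Delta'^{2}(X_1\beta_0) W^2(X_1\beta_0) V^{-1} \{X_1 - G_1(X_1\beta_0)\}J_{\beta_0^{(r)}}\big]$$

with $G_1(X_1\beta_0) = (g_1(X_{11}^T\beta_0),\dots,g_1(X_{1m}^T\beta_0))^T$ and $g_1(u) = E(X_{1j}\mid X_{1j}^T\beta_0 = u)$, and $B$ is defined
in condition (C6).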



From Theorems 2 and 3, we can obtain the following Corollary 1.

Corollary 1 Suppose that conditions (C1)-(C6) hold. Then, for any $u \in \mathcal{U}_w$,

$$\sqrt{nh}\,\{\hat g^*(u) - g(u)\} \xrightarrow{D} N(0, \sigma^2(u)),$$

where $\sigma^2(u)$ is defined in Theorem 2.

From Theorem 3, we obtain an asymptotic result regarding the angle between $\hat\beta$ and $\beta_0$,
which can be used to study issues of sufficient dimension reduction.

Corollary 2 Suppose that the conditions of Theorem 3 hold. Then

$$|\hat\beta^T\beta_0| - 1 = O_P(n^{-1/2}),$$

where $|\hat\beta^T\beta_0|$ is the absolute inner product. Since both vectors have unit norm, their inner
product is the cosine of the angle between the two directions.

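The following theorem provides the convergence rates of the estimators of $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$,
respectively.

Theorem 4 Suppose that conditions (C1)-(C6) hold. Then

$$\hat\sigma_\alpha^2 - \sigma_\alpha^2 = O_P(n^{-1/2}), \qquad \hat\sigma_\varepsilon^2 - \sigma_\varepsilon^2 = O_P(n^{-1/2}).$$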

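To construct confidence regions for $\beta_0$, a plug-in estimator of the limiting variance of
$\hat\beta$ is needed. We define the following estimators $\hat B$ and $\hat A$ of $B$ and $A$, respectively, by
$\hat B = B_n(\hat g,\hat\beta)$ and

$$\hat A = \frac{1}{n}\sum_{i=1}^n J_{\hat\beta^{(r)}}^T\{X_i - \hat G_1(X_i\hat\beta)\}^T \hat G_\Delta'^{2}(X_i\hat\beta) W^2(X_i\hat\beta) \hat V^{-1} \{X_i - \hat G_1(X_i\hat\beta)\}J_{\hat\beta^{(r)}},$$

where $\hat V = \hat\sigma_\alpha^2 1_m1_m^T + \hat\sigma_\varepsilon^2 I_m$, $\hat G'_\Delta(X_i\hat\beta) = \mathrm{diag}\{\hat g'(X_{i1}^T\hat\beta;\hat\beta),\dots,\hat g'(X_{im}^T\hat\beta;\hat\beta)\}$,
$\hat G_1(X_i\hat\beta) = (\hat g_1(X_{i1}^T\hat\beta;\hat\beta),\dots,\hat g_1(X_{im}^T\hat\beta;\hat\beta))^T$ with
$\hat g_1(u;\beta) = \sum_{i=1}^n\sum_{j=1}^m W_{nij}(u;\beta)X_{ij}$, which is the estimator of $g_1(u) = E(X_{ij}\mid X_{ij}^T\beta = u)$,

$$W_{nij}(u;\beta) = \frac{n^{-1}K_h(X_{ij}^T\beta - u)\{S_{n,2}(u;\beta) - ((X_{ij}^T\beta - u)/h)\,S_{n,1}(u;\beta)\}}{S_{n,0}(u;\beta)S_{n,2}(u;\beta) - S_{n,1}^2(u;\beta)}$$

and $S_{n,l}(u;\beta)$ is defined in (2.6). It is easy to prove that $J_{\hat\beta^{(r)}} \xrightarrow{P} J_{\beta_0^{(r)}}$, $\hat B \xrightarrow{P} B$ and
$\hat A \xrightarrow{P} A$. Then for any $p \times l$ matrix $H$ of full rank with $l < p$, Theorem 3 implies that

$$\big(n^{-1}H^TJ_{\hat\beta^{(r)}}\hat B^{-1}\hat A\hat B^{-1}J_{\hat\beta^{(r)}}^TH\big)^{-1/2}H^T(\hat\beta - \beta_0) \xrightarrow{D} N(0, I_l).$$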

We use Theorem 10.2d in Arnold (1981) to obtain the following limiting distribution.

Theorem 5 Suppose that the conditions of Theorem 3 hold. Then

$$(\hat\beta - \beta_0)^TH\big(n^{-1}H^TJ_{\hat\beta^{(r)}}\hat B^{-1}\hat A\hat B^{-1}J_{\hat\beta^{(r)}}^TH\big)^{-1}H^T(\hat\beta - \beta_0) \xrightarrow{D} \chi_l^2.$$

Theorem 5 can be used to construct the large sample confidence region or interval for the
parameter $\beta_0$.
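
Given plug-in matrices, the quadratic form in Theorem 5 is straightforward to evaluate. The Python sketch below computes it for a toy two-dimensional example; the function name and all numerical inputs (the Jacobian, $\hat B$, $\hat A$, $H$ and $n$) are invented for illustration and are not results from the paper.

```python
import numpy as np

def wald_statistic(beta_hat, beta_null, J, B, A, H, n):
    """Quadratic form of Theorem 5 for the linear combination H^T beta."""
    B_inv = np.linalg.inv(B)
    cov = J @ B_inv @ A @ B_inv @ J.T           # sandwich J B^{-1} A B^{-1} J^T
    diff = H.T @ (beta_hat - beta_null)
    return float(diff @ np.linalg.solve(H.T @ cov @ H / n, diff))

beta_null = np.array([0.6, 0.8])                # hypothesized unit vector, r = 0
J = np.array([[-0.8 / 0.6], [1.0]])             # Jacobian (2.1) at beta_null
B = np.array([[2.0]])                           # toy plug-in values
A = np.array([[8.0]])
H = np.array([[0.0], [1.0]])                    # selects the second component
beta_hat = np.array([np.sqrt(1.0 - 0.9**2), 0.9])
stat = wald_statistic(beta_hat, beta_null, J, B, A, H, n=100)
reject = stat > 3.841                           # 0.95 quantile of chi-square with 1 df
```

Here $H^TJ\hat B^{-1}\hat A\hat B^{-1}J^TH = 2$, so the statistic equals $0.1^2/(2/100) = 0.5$ and the hypothesized value is not rejected at the 5% level.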

Applying Corollary 1, we can construct a pointwise confidence interval for $g(u_0)$ at a fixed
point $u_0 \in \mathcal{U}_w$. However, we need to use plug-in estimators for the asymptotic bias and
variance. The asymptotic bias and variance of $\hat g(u_0)$ depend on $\sigma_\varepsilon^2$, $\sigma_\alpha^2$
and $f_j(u_0)$. $\sigma_\varepsilon^2$ and $\sigma_\alpha^2$ have been estimated in (2.9) and (2.10). The estimator of $f_j(u_0)$,
$j = 1,\dots,m$, is defined by

$$\hat f_j(u_0) = \frac{1}{nh}\sum_{i=1}^n K\big((X_{ij}^T\hat\beta - u_0)/h\big).$$

Thus, we can derive $\hat\sigma^2(u_0)$ by replacing $f_j(u_0)$, $\sigma_\varepsilon^2$ and $\sigma_\alpha^2$ by their consistent estimators
$\hat f_j(u_0)$, $\hat\sigma_\varepsilon^2$ and $\hat\sigma_\alpha^2$, respectively. Therefore, $\hat\sigma^2(u_0)$ is a consistent estimator of $\sigma^2(u_0)$. By
Corollary 1, we have

$$\sqrt{nh}\,\{\hat g(u_0;\hat\beta) - g(u_0)\}/\hat\sigma(u_0) \xrightarrow{D} N(0, 1).$$

Using the above result, we can obtain an approximate $1 - \alpha$ confidence interval for $g(u_0)$.
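
As a sketch of how such an interval might be computed, the Python function below combines the kernel density estimates $\hat f_j(u_0)$ with estimated variance components. It assumes an Epanechnikov kernel, for which $\int K^2(u)\,du = 3/5$, and ignores the bias term, which corresponds to the undersmoothing case $nh^5 \to 0$. The function name, the input values and the deterministic index grid are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def epanechnikov(t):
    return 0.75 * np.clip(1.0 - t**2, 0.0, None)

def g_confidence_interval(g_hat_u0, u0, index, h, sigma_alpha2, sigma_eps2, level=0.95):
    """Pointwise interval for g(u0); index is the (n, m) array of X_ij^T beta_hat."""
    n, _ = index.shape
    f_hat = epanechnikov((index - u0) / h).sum(axis=0) / (n * h)   # f_hat_j(u0), j = 1..m
    kernel_l2 = 0.6                               # integral of K^2 for this kernel
    sigma2_u0 = (sigma_alpha2 + sigma_eps2) * kernel_l2 / f_hat.sum()
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half = z * np.sqrt(sigma2_u0 / (n * h))
    return g_hat_u0 - half, g_hat_u0 + half

index = np.tile(np.linspace(-1.0, 1.0, 101)[:, None], (1, 3))     # toy index values
lo, hi = g_confidence_interval(0.2, 0.0, index, h=0.5,
                               sigma_alpha2=1.0, sigma_eps2=0.5)
```

The interval is centered at the supplied estimate $\hat g(u_0)$ and its half width shrinks at the rate $(nh)^{-1/2}$.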
4 Concluding remarks



In this paper we have investigated the inference of single-index mixed models with lon- 
gitudinal data. We use local linear regression smoothing to estimate the link function, and 
use the generalized estimating equations to estimate the parametric components. We also 
construct the estimators of the variance components. The proposed method avoids the need 
for multivariate distribution by only assuming a functional form on the marginal distribution 
for each time. The covariance structure across time is treated as a nuisance. A key feature 
of our approach is that we transform a restricted least squares problem to an unrestricted 
least squares problem by solving the estimating equations to estimate the parametric com- 
ponents. The asymptotic variance of our estimator for parametric components is the same 
as that obtained by Wang et al. (2010) in pure single-index models. 

In longitudinal studies, sometimes the covariance structure is very complex; that is, the
covariance matrix of the outcome variable may be of a general form, allowing $V$ to have
$\frac{1}{2}m(m-1)$ parameters. Our method can be extended to study this type of problem. In particular, the
estimators obtained using our method will be efficient only if the observations on a subject 
are independent. The estimating equations described in this paper can be considered as 
an extension of the quasi-likelihood to the case where the second moment cannot be fully 
specified in terms of the expectation but rather additional correlation parameters must be 
estimated. It is the independence across subjects that allows us to consistently estimate 
these nuisance parameters where this could not be done otherwise. 